Provided are two plant cDNA clones that
are homologs of the bacterial CelA genes that
encode the catalytic subunit of cellulose
synthase, derived from cotton (Gossypium hirsutum)
[...] regions to these encoding regions to cellulose
synthase. Methods for using cellulose synthase in
cotton fiber and wood quality modification are
[...]

PLANT CELLULOSE SYNTHASE AND PROMOTER SEQUENCES

INTRODUCTION 

Technical Field

This invention relates to plant cellulose synthase cDNA 
encoding sequences, and their use in modifying plant 
phenotypes. Methods are provided whereby the sequences can be 
used to control or limit the expression of endogenous 
cellulose synthase.

This invention also relates to methods of using in vitro 
constructed DNA transcription or expression cassettes capable 
of directing fiber-tissue transcription of a DNA sequence of 
interest in plants to produce fiber cells having an altered 
phenotype, and to methods of providing for or modifying 
various characteristics of cotton fiber. The invention is 
exemplified by methods of using cotton fiber promoters for 
altering the phenotype of cotton fiber, and cotton fibers 
produced by the method. 

Background 

In spite of much effort, no one has succeeded in 
isolating and characterizing the enzyme(s) responsible for
synthesis of the major cell wall polymer of plants, cellulose. 

Numerous efforts have been directed toward the study of 
synthesis of cellulose (1,4-β-D-glucan) in higher plants.
However, hampered by low rates of activity in vitro, the 
cellulose synthase of plants has resisted purification and 
detailed characterization (for reviews, see 1,2). Aided by 
the discovery of cyclic-di-GMP as a specific activator, the 
cellulose synthase of the bacterium Acetobacter xylinum can be 
easily assayed in vitro, has been purified to homogeneity, and 
a catalytic subunit identified (for reviews, see 2,3). 
Furthermore, an operon of four genes involved in cellulose 
synthesis in A. xylinum has been cloned (4-7).

Characterization of these genes indicates that the first 
gene, termed either BcsA (7) or AcsAB (6), codes for the 83 kD
subunit of the cellulose synthase that binds the substrate
UDP-glc and presumably catalyzes the polymerization of glucose
residues to 1,4-β-D-glucan (8). The second gene (B) of the
operon is believed to function as a regulatory subunit binding
cyclic-di-GMP (9), while recent evidence suggests that the C
and D genes may code for proteins that form a pore allowing
secretion of the polymer and control the pattern of
crystallization of the resulting microfibrils (6).

Recent studies with another gram-negative bacterium, 
Agrobacterium tumefaciens, have also led to cloning of genes
involved in cellulose synthesis (10,11), although the proposed 
pathway of synthesis differs in some respects from that of A. 
xylinum. In A. tumefaciens, a CelA gene showing significant 
homology to the BcsA/AcsAB gene of A. xylinum, is proposed to 
transfer glc from UDP-glc to a lipid acceptor; other gene 
products may then build up a lipid oligosaccharide that is 
finally polymerized to cellulose by the action of an 
endo-glucanase functioning in a synthetic mode. In addition, 
homologs of the CelA, B, and C genes have been identified in 
E. coli, but, as this organism is not known to synthesize 
cellulose in vivo, the function of these genes is not clear 
(2).

These successes in bacterial systems opened the 
possibility that homologs of the bacterial genes might be 
identified in higher plants. However, experiments in a number
of laboratories utilizing the A. xylinum genes as probes for
screening plant cDNA libraries have failed to identify similar
plant genes. Such lack of success suggests that, if plants do
contain homologs of the bacterial genes, their overall
sequence homology is not very high. Recent studies analyzing
the conserved motifs common to glycosyltransferases using
either UDP-glc or UDP-GlcNAc as substrate suggest that there
are specific conserved regions that might be expected to be
found in any plant homolog of the catalytic subunit (referred
to hereafter as CelA). In one of these studies, Delmer and
Amor (2) identified a motif common to many such
glycosyltransferases, including the bacterial CelA proteins.
An independent analysis (6) also concluded that this motif was
highly conserved in a group of similar glycosyltransferases.

Extending these studies further, Saxena et al. (12)
presented an elegant model for the mechanism of catalysis for
enzymes such as cellulose synthase that have the unique
problem of synthesizing consecutive residues that are rotated
approximately 180° with respect to each other. The
model invokes independent UDP-glc binding sites and, based
upon hydrophobic cluster analysis of these enzymes, the
authors concluded that 3 critical regions in all such
processive glycosyltransferases each contain a conserved
aspartate (D) residue, while a fourth region contained a
conserved QXXRW motif. The first D residue resides in the
motif as previously analyzed (2,6).
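
As an illustration only, the QXXRW motif described above can be
located in a deduced protein sequence with a simple pattern scan.
The sketch below is not the hydrophobic cluster analysis of
reference 12; the function names and the toy sequence are
hypothetical and are not taken from any sequence of this
disclosure.

```python
import re

# Q, then any two residues, then R and W. The lookahead lets
# overlapping occurrences (e.g. QRWRW inside QAQRWRW) be reported.
QXXRW = re.compile(r"(?=(Q[A-Z]{2}RW))")


def find_qxxrw(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position, matched pentapeptide) for each QXXRW hit."""
    seq = protein.upper()
    return [(m.start() + 1, m.group(1)) for m in QXXRW.finditer(seq)]


def aspartate_positions(protein: str) -> list[int]:
    """Return the 1-based positions of every aspartate (D) residue."""
    return [i + 1 for i, aa in enumerate(protein.upper()) if aa == "D"]


# Hypothetical toy peptide, used only to exercise the functions.
toy = "MKDLLAQGTRWSDD"
print(find_qxxrw(toy))
print(aspartate_positions(toy))
```

A scan of this kind only flags candidate positions; whether a given
D residue or QXXRW occurrence lies in a conserved region still has
to be judged from an alignment.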

In general, genetic engineering techniques have been
directed to modifying the phenotype of individual prokaryotic
and eukaryotic cells, especially in culture. Plant cells have
proven more intransigent than other eukaryotic cells, due not
only to a lack of suitable vector systems but also as a result
of the different goals involved. For many applications, it is
desirable to be able to control gene expression at a
particular stage in the growth of a plant or in a particular
plant part. For this purpose, regulatory sequences are
required which afford the desired initiation of transcription
in the appropriate cell types and/or at the appropriate time
in the plant's development without having serious detrimental
effects on plant development and productivity. It is
therefore of interest to be able to isolate sequences which
can be used to provide the desired regulation of transcription
in a plant cell during the growing cycle of the host plant.

One aspect of this interest is the ability to change the 
phenotype of particular cell types, such as differentiated 
epidermal cells that originate in fiber tissue, i.e. cotton 
fiber cells, so as to provide for altered or improved aspects
of the mature cell type. Cotton is a plant of great
commercial significance. In addition to the use of cotton
fiber in the production of textiles, other uses of cotton
include food preparation with cotton seed oil and animal feed
derived from cotton seed husks. 

A related goal involving the control of cell wall
characteristics would be to affect valuable secondary
characteristics of wood for paper and forestry products. For
instance, by altering the balance of cellulose and lignin, the 
quality of wood for paper production may be improved. 

Finally, despite the importance of cotton as a crop, the 
breeding and genetic engineering of cotton fiber phenotypes 
has taken place at a relatively slow rate because of the 
absence of reliable promoters for use in selectively effecting 
changes in the phenotype of the fiber. In order to effect the 
desired phenotypic changes, transcription initiation regions 
capable of initiating transcription in fiber cells during 
development are desired. Thus, an important goal of cotton 
bioengineering research is the acquisition of a reliable 
promoter which would permit expression of a protein 
selectively in cotton fiber to affect such qualities as fiber 
strength, length, color and dyeability.

Relevant Literature

Cotton fiber-specific promoters are discussed in PCT 
publications WO 94/12014 and WO 95/08914, and John and Crow, 
Proc. Natl. Acad. Sci. USA, 89:5769-5773, 1992. cDNA clones 
that are preferentially expressed in cotton fiber have been 
isolated. One of the clones isolated corresponds to mRNA and 
protein that are highest during the late primary cell wall and 
early secondary cell wall synthesis stages. John and Crow, 
supra. 

In plants, control of cytoskeletal organization is poorly 
understood in spite of its importance for the regulation of 
patterns of cell division, expansion, and subsequent 
deposition of secondary cell wall polymers. The cotton fiber 
represents an excellent system for studying cytoskeletal 
organization. Cotton fibers are single cells in which cell 
elongation and secondary wall deposition can be studied as 
distinct events. These fibers develop synchronously within 
the boll following anthesis, and each fiber cell elongates for
about 3 weeks, depositing a thin primary wall (Meinert and
Delmer, (1984) Plant Physiol. 59: 1088-1097; Basra and Malik,
(1984) Int Rev of Cytol 89: 65-113). At the time of
transition to secondary wall cellulose synthesis, the fiber
cells undergo a synchronous shift in the pattern of cortical
microtubule and cell wall microfibril alignments, events which
may be regulated upstream by the organization of actin
(Seagull, (1990) Protoplasma 159: 44-59; and (1992) In:
Proceedings of the Cotton Fiber Cellulose Conference, National
Cotton Council of America, Memphis, TN, pp 171-192).

Agrobacterium-mediated cotton transformation is described 
in Umbeck, United States Patents Nos. 5,004,863 and 5,159,135,
and cotton transformation by particle bombardment is reported
in WO 92/15675, published September 17, 1992. Transformation
of Brassica has been described by Radke et al. (Theor. Appl.
Genet. (1988) 75:685-694; Plant Cell Reports (1992)
11:499-505).

Genes involved in lignin biosynthesis are described by 
Dwivedi, U.N., Campbell, W.H., Yu, J., Datla, R.S.S., Chiang,
V.L., and Podila, G.K. (1994) "Modification of lignin
biosynthesis in transgenic Nicotiana through expression of an
antisense O-methyltransferase gene from Populus" Pl. Mol.
Biol. 26: 61-71; and Tsai, C.J., Podila, G.K. and Chiang, V.L.
(1995) "Nucleotide sequence of Populus tremuloides gene for
caffeic acid/5-hydroxyferulic acid O-methyltransferase" Pl.
Physiol. 107: 1459; and also U.S. Patent No. 5,451,514
(claiming the use of cinnamyl alcohol dehydrogenase gene in an
antisense orientation such that the endogenous plant cinnamyl
alcohol dehydrogenase gene is inhibited).

SUMMARY OF THE INVENTION

Two cotton genes, CelA1 and CelA2, have been shown to be
highly expressed in developing fibers at the onset of 
secondary wall cellulose synthesis. Comparisons indicate that 
these genes and the rice CelA gene encode polypeptides that 
have three regions of reasonably high homology, both in terms 
of primary amino acid sequence and hydropathy, with bacterial 
CelA proteins. The fact that these homologous stretches are 
in the same sequential order as in the bacterial CelA proteins 
and also contain four subregions previously predicted to be
critical for substrate binding and catalysis (12) argues that 
the plant genes encode true homologs of bacterial CelA 
proteins. Furthermore, the pattern of expression in fiber as 
well as our demonstration that at least one of these 
highly-conserved regions is critical for UDP-glc binding also 
supports this conclusion. 

Novel DNA promoter sequences are also supplied, and 
methods for their use are described for directing 
transcription of a gene of interest in cotton fiber. 

The developing cotton fiber is an excellent system for 
studies on cellulose synthesis as these single cells develop 
synchronously in the boll and, at the end of elongation, 
initiate the synthesis of a nearly pure cellulosic cell wall. 
During this transition period, synthesis of other cell wall 
polymers ceases and the rate of cellulose synthesis is 
estimated to rise nearly 100-fold in vivo (13). In our
continuing efforts to identify genes critical to this phase of
fiber development, we have initiated a program sequencing
randomly selected cDNA clones derived from a library prepared
from mRNA harvested from fibers at the stage in which
secondary wall synthesis approaches its maximum rate
(approximately 21 dpa).

We have characterized two cotton (Gossypium hirsutum)
cDNA clones and identified one rice (Oryza sativa) cDNA that
are homologs of the bacterial CelA genes that encode the
catalytic subunit of cellulose synthase. Three regions in the
deduced amino acid sequences of the plant CelA gene products
are conserved with respect to the proteins encoded by
bacterial CelA genes. Within these conserved regions are four
highly conserved subdomains previously suggested to be
critical for catalysis and/or binding of the substrate
UDP-glc. An overexpressed DNA segment of the cotton CelA1
gene encodes a polypeptide fragment that spans these domains
and effectively binds UDP-glc, while a similar fragment having
one of these domains deleted does not. The plant CelA genes
show little homology at the amino and carboxy terminal regions
and also contain two internal insertions of sequence, one
conserved and one hypervariable, that are not found in the
bacterial gene sequences. Cotton CelA1 and CelA2 genes are
expressed at high levels during active secondary wall
cellulose synthesis in the developing fiber. Genomic Southern
analyses in cotton demonstrate that CelA comprises a family of
approximately four distinct genes.

We report here the discovery of two cotton genes that 
show highly enhanced expression at the time of onset of
secondary wall synthesis in the fiber. The sequences of these
two cDNA clones, termed CelA1 and CelA2, while not identical,
are highly homologous to each other and to a sequenced rice
EST clone discovered in the dbEST databank. The deduced
proteins also share significant regions of homology with the
bacterial CelA proteins. Coupled with their high level and
specificity of expression in fiber at the time of active
cellulose synthesis, as well as the ability of an E. coli
expressed fragment of the CelA1 gene product to bind UDP-glc,
these findings support the conclusion that these plant genes
are true homologs of the bacterial CelA genes. 

The methods of the present invention include transfecting 
a host plant cell of interest with a transcription or 
expression cassette comprising a cotton fiber promoter and
generating a plant which is grown to produce fiber having the
desired phenotype. Constructs and methods of the subject
invention thus find use in modulation of endogenous fiber
products, as well as production of exogenous products and in
modifying the phenotype of fiber and fiber products. The
constructs also find use as molecular probes. In particular,
constructs and methods for use in gene expression in cotton
embryo tissues are considered herein. By these methods, novel
cotton plants and cotton plant parts, such as modified cotton
fibers, may be obtained.

The sequences and constructs of this invention may also 
be used to isolate related cellulose synthase genes from 
forest tree species, for use in transforming and modifying 
wood quality. As an example, lignin, an undesirable
by-product of the pulping process, may be reduced by
over-expressing the cellulose synthase product and diverting
production into cellulose.

Thus, the application provides constructs and methods of 
use relating to modification of cell and cell wall phenotype
in cotton fiber and wood products.

DESCRIPTION OF THE DRAWINGS 

Figure 1. Northern analysis of CelA1 gene in cotton
tissues and developing fiber. Approximately 10 µg total RNA
from each tissue was loaded per lane. Blots were prepared, and
probe preparation and hybridization were performed,
as described previously (14). The entire CelA1 cDNA insert
was used as a probe in this experiment. Exposure time for
the autoradiogram was seven hours at -70°.

Figure 2. Cotton genomic DNA analysis for both the
CelA1 and CelA2 cDNAs. Approximately 10-12 µg of DNA was
digested with the designated restriction enzymes and
electrophoresed on 0.9% agarose gels. Probe preparation and
hybridization conditions were as described previously (14).
The entire CelA1 and CelA2 cDNAs were utilized as probes.
Exposure time for the autoradiograms was three days at -70°.

Figure 3. Multiple alignment of deduced amino acid 
sequences of plant and bacterial CelA proteins. Analyses were 
performed by Clustal Analysis using the Lasergene Multalign
program (DNAStar, Madison, WI) with gap and gap-length
penalties of 10 and a PAM250 weight table. Residues are boxed
and shaded when they show chemical group similarity in 4 out
of 7 proteins compared. H-1, H-2, H-3 regions are indicated
where homology between plant and bacterial proteins is
highest. The plant proteins show two insertions that are not
present in the bacterial protein: one, P-CR, is conserved
among the plant CelA genes, while a second insertion is
hypervariable (HVR) between plant genes. The presence of the
P-CR and HVR regions led to inaccurate alignments when the
entire proteins were compared; the optimal alignments shown
here were thus performed in five separate blocks. Regions
U-1 through U-4 are predicted to be critical for UDP-glc
binding and catalysis in bacterial CelA proteins; the
predicted critical D residues and QXXRW motif are boxed and
starred respectively. Potential sites of N-glycosylation are
indicated by -G-.

Figure 4. Kyte-Doolittle hydropathy plots of cotton 
CelA1 aligned with those of two bacterial CelA proteins.
Alignments and designations are based upon those noted in Fig.
2. The hydropathy profiles shown were calculated using a
window of 7, although a window of 19 was used for predictions
of transmembrane helices that are indicated by the arrows. 

Figure 5. An E. coli expressed GST cotton CelA1 fusion
protein containing U-1 through U-4 binds UDP-glc in
vitro. Panel A shows a hypothetical orientation of the cotton
CelA1 protein in the plasma membrane and indicates the
cytoplasmic region containing the sub-domains U-1 to U-4.
GST-fusion constructs for CelA1 fragments spanning the region
between the potential transmembrane helices (A through H) were
prepared as described in Materials and Methods. The purified
and blotted CelA1 fusion protein fragments were tested as
described in Materials and Methods for their ability to bind
32P-UDP-glc (panel B). M refers to the molecular weight
markers while CS and JEU1 refer to the full-length and deleted
GST-CelA1 fusion polypeptides. The left panel shows proteins
stained with Coomassie blue while the other three panels show
representative autoradiograms under different binding
conditions as described in Materials and Methods. Ph, BSA
and Ova refer to the molecular weight standards phosphorylase
b, bovine serum albumin and ovalbumin respectively.

Figure 6. Nucleic acid sequences to cDNA of CelA1
protein of cotton (Gossypium hirsutum).

Figure 7. Nucleic acid sequences to cDNA of CelA2
protein of cotton (Gossypium hirsutum), including
approximately the last 3' two-thirds of the encoding region.

Figure 8. Genomic nucleic acid sequences of CelA1
protein of cotton (Gossypium hirsutum), including
approximately 900 bases of the promoter region 5' to the
encoding sequences.
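
For readers unfamiliar with the hydropathy profiles of Figure 4, a
Kyte-Doolittle profile is a sliding-window average of per-residue
hydropathy values. The following is a minimal sketch, assuming the
published Kyte-Doolittle scale and the window sizes named in the
figure description (7 for the profile, 19 for transmembrane
predictions). The cutoff of 1.6 is a commonly used rule of thumb and
is an assumption here, not a value given in this document; the
function names are likewise illustrative.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> list[float]:
    """Mean hydropathy of each full window, ordered by window start."""
    seq = protein.upper()
    if window < 1 or window > len(seq):
        return []
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def candidate_tm_windows(
    protein: str, window: int = 19, threshold: float = 1.6
) -> list[int]:
    """1-based start positions of windows whose mean exceeds threshold.

    The threshold is an assumed heuristic, not taken from the source.
    """
    profile = hydropathy_profile(protein, window)
    return [i + 1 for i, score in enumerate(profile) if score > threshold]
```

For example, `hydropathy_profile(seq, 7)` gives the values that would
be plotted against residue position, while `candidate_tm_windows(seq)`
lists window starts that might merit inspection as potential
membrane-spanning helices.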

DETAILED DESCRIPTION OF THE INVENTION 

In accordance with the subject invention, novel
constructs and methods are described, which may be used to
provide for transcription of a nucleotide sequence of interest
in cells of a plant host, preferentially in cotton fiber cells
to produce cotton fiber having an altered color phenotype.

Cotton fiber is a differentiated single epidermal cell of
the outer integument of the ovule. It has four distinct
growth phases: initiation, elongation (primary cell wall
synthesis), secondary cell wall synthesis, and maturation.
Initiation of fiber development appears to be triggered by
hormones. The primary cell wall is laid down during the
elongation phase, lasting up to 25 days postanthesis (DPA).
Synthesis of the secondary wall commences prior to the 
cessation of the elongation phase and continues to 
approximately 40 DPA, forming a wall of almost pure cellulose. 

The constructs for use in such cells may include several
forms, depending upon the intended use of the construct.
Thus, the constructs include vectors, transcriptional
cassettes, expression cassettes and plasmids. The
transcriptional and translational initiation region (also
sometimes referred to as a "promoter") preferably comprises
a transcriptional initiation regulatory region and a 
translational initiation regulatory region of untranslated 5' 
sequences, "ribosome binding sites," responsible for binding 
mRNA to ribosomes and translational initiation. It is 
preferred that all of the transcriptional and translational 
functional elements of the initiation control region are 
derived from or obtainable from the same gene. In some 
embodiments, the promoter will be modified by the addition of 
sequences, such as enhancers, or deletions of nonessential 
and/or undesired sequences. By "obtainable" is intended a 
promoter having a DNA sequence sufficiently similar to that of 
a native promoter to provide for the desired specificity of 
transcription of a DNA sequence of interest. It includes 
natural and synthetic sequences as well as sequences which may 
be a combination of synthetic and natural sequences. 

Cotton fiber transcriptional initiation regions of 
cellulose synthase are used in cotton fiber modification. 

A transcriptional cassette for transcription of a 
nucleotide sequence of interest in cotton fiber will include,
in the direction of transcription, the cotton fiber
transcriptional initiation region, a DNA sequence of interest,
and a transcriptional termination region functional in the
plant cell. When the cassette provides for the transcription
and translation of a DNA sequence of interest it is considered
an expression cassette. One or more introns may also be
present.
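
The ordering of cassette elements described above can be pictured
with a small data structure. This is a bookkeeping sketch only; the
class name, field names and placeholder labels are hypothetical and
do not correspond to any actual construct of this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cassette:
    """Elements of a cassette, listed in the direction of transcription."""

    initiation_region: str      # e.g. a cotton fiber promoter region
    sequence_of_interest: str   # sense, antisense, or other DNA sequence
    termination_region: str     # functional in the plant cell
    translated: bool = False    # True when translation is also provided for

    def ordered_elements(self) -> list[str]:
        """Return the elements 5' to 3'."""
        return [
            self.initiation_region,
            self.sequence_of_interest,
            self.termination_region,
        ]

    def kind(self) -> str:
        """Name the cassette type per the definition in the text."""
        if self.translated:
            return "expression cassette"
        return "transcriptional cassette"


# Placeholder labels only; no real sequences are implied.
example = Cassette(
    initiation_region="fiber initiation region (placeholder)",
    sequence_of_interest="antisense sequence (placeholder)",
    termination_region="termination region (placeholder)",
)
print(example.kind(), example.ordered_elements())
```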

Other sequences may also be present, including those 
encoding transit peptides and secretory leader sequences as 
desired. 

Downstream from, and under the regulatory control of, the 
cellulose synthase transcriptional/translational initiation 
control region is a nucleotide sequence of interest which 
provides for modification of the phenotype of fiber. The 
nucleotide sequence may be any open reading frame encoding a 
polypeptide of interest, for example, an enzyme, or a sequence 
complementary to a genomic sequence, where the genomic 
sequence may be an open reading frame, an intron, a noncoding
leader sequence, or any other sequence where the complementary
sequence inhibits transcription, messenger RNA processing, for
example, splicing, or translation. The nucleotide sequences
of this invention may be synthetic, naturally derived, or
combinations thereof. Depending upon the nature of the DNA
sequence of interest, it may be desirable to synthesize the
sequence with plant preferred codons. The plant preferred
codons may be determined from the codons of highest frequency
in the proteins expressed in the largest amount in the
particular plant species of interest. Phenotypic modification
can be achieved by modulating production either of an
endogenous transcription or translation product, for example
as to the amount, relative distribution, or the like, or an
exogenous transcription or translation product, for example to
provide for a novel function or products in a transgenic host
cell or tissue. Of particular interest are DNA sequences
encoding expression products associated with the development
of plant fiber, including genes involved in metabolism of
cytokinins, auxins, ethylene, abscisic acid, and the like.
Methods and compositions for modulating cytokinin expression
are described in United States Patent No. 5,177,307, which
disclosure is hereby incorporated by reference.
Alternatively, various genes, from sources including other
eukaryotic or prokaryotic cells, including bacteria, such as
those from Agrobacterium tumefaciens T-DNA auxin and cytokinin
biosynthetic gene products, for example, and mammals, for
example interferons, may be used. 
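
The determination of plant preferred codons mentioned above, taking
the codon of highest frequency for each amino acid among highly
expressed genes, can be sketched as a simple tally. The sketch
assumes the standard genetic code and that the caller supplies
in-frame coding sequences; the function name and the short input
sequence are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def preferred_codons(coding_sequences: list[str]) -> dict[str, str]:
    """Map each amino acid to its most frequent codon in the input set.

    Sequences are read in frame from their first base; stop codons and
    codons containing ambiguous bases are ignored.
    """
    counts: dict[str, Counter] = defaultdict(Counter)
    for cds in coding_sequences:
        cds = cds.upper().replace("U", "T")
        usable = len(cds) - len(cds) % 3
        for i in range(0, usable, 3):
            codon = cds[i:i + 3]
            aa = CODON_TABLE.get(codon)
            if aa is not None and aa != "*":
                counts[aa][codon] += 1
    return {aa: c.most_common(1)[0][0] for aa, c in counts.items()}


# Hypothetical short coding sequence: Met-Ala-Ala-Ala-stop.
print(preferred_codons(["ATGGCTGCTGCCTAA"]))
```

In practice the input would be the coding sequences of the most
abundantly expressed proteins of the plant species of interest, and
a synthetic sequence would then be written using the returned codon
for each amino acid.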

Alternatively, the present invention provides the 
sequences to cotton cellulose synthase, which can be
expressed, or down regulated by antisense or co-suppression
with its own, or other cotton or other fiber promoters to
modify fiber phenotype.

In cotton, primary wall hemicellulose synthesis ceases as 
secondary wall synthesis initiates in the fiber, and there are
only two possible β-glucans synthesized in fibers at the time
these genes are highly expressed: callose and cellulose (20).
The following data strongly argue against the plant CelA genes
coding for callose synthase: 1) callose synthase binds UDP-glc
and is activated in a Ca2+-dependent manner (2), while the
CelA1 polypeptide fragment containing the UDP-glc binding site
preferentially binds UDP-glc in a Mg2+-dependent manner,
similar to bacterial cellulose synthase (9); 2) the timing
of synthesis of callose in vivo in developing cotton fiber
(20) does not match the expression of the cotton CelA genes
(Fig. 1); 3) comparison of the CelA gene sequences with those
of suspected 1,3-β-glucan synthase genes from yeast (21)
indicated no significant homology. 

It is still possible that the CelA protein might encode
both activities, as hypothesized some years ago (22-23), and
the plant CelAs might be responsible for direct polymerization
of glucan from UDP-glc as proposed for A. xylinum, although
they may catalyze synthesis of a lipid-glc precursor as
proposed for the CelA protein of A. tumefaciens.

In addition to their similarities, the plant CelA genes 
show several very interesting divergences from their bacterial 
ancestors, and these may account for the previous lack of 
success in using bacterial probes to detect these cDNA clones. 
However, a BLAST search of protein data banks (24) using the
entire protein sequence of cotton CelA1 always shows highest
homology with the bacterial cellulose synthases. Of
particular interest is the insertion of two unique,
plant-specific regions designated P-CR and HVR. These
regions are clearly not artifacts of cloning as they are
observed in both cotton genes as well as the rice CelA gene.
The three plant proteins show a high degree of amino acid
homology to each other throughout most of their length,
diverging only at the N- and C-terminal ends and the very
interesting HVR region. It is tempting to speculate that the
HVR region may confer some specificity of function; the
highly charged and cysteine-rich nature of the first portion
of HVR could make this region a potential candidate for
interaction with specific regulatory proteins, for
cytoskeletal elements, or for redox regulation. In addition,
we note the presence of several cysteine residues near the N-
and C-terminal regions of the protein that might serve as
substrates for palmitoylation and also serve to help anchor
the protein in the membrane (25).

In summary, the finding of these plant CelA homologs 
potentially opens up an exciting chapter in research on 
cellulose synthesis in higher plants. Their finding is of
particular significance since biochemical approaches to
identification of plant cellulose synthase have proven
exceedingly difficult. One obvious challenge will be to gain
definitive proof that these genes are truly functional in
cellulose synthesis in vivo. Other promising goals will be to
identify other components of a complex that might interact
with CelA, such as that proposed for sucrose synthase (26),
and/or a regulatory subunit that binds cyclic-di-GMP (9,27) or
other glycosyltransferases (10,11).

Transcriptional cassettes may be used when the
transcription of an anti-sense sequence is desired. When the
expression of a polypeptide is desired, expression cassettes
providing for transcription and translation of the DNA
sequence of interest will be used. Various changes are of
interest; these changes may include modulation (increase or
decrease) of formation of particular saccharides, hormones,
enzymes, or other biological parameters. These also include
modifying the composition of the final fiber, that is, changing
the ratio and/or amounts of water, solids, fiber or sugars.
Other phenotypic properties of interest for modification
include response to stress, organisms, herbicides, brushing,
growth regulators, and the like. These results can be
achieved by providing for reduction of expression of one or
more endogenous products, particularly an enzyme or cofactor,
either by producing a transcription product which is
complementary (anti-sense) to the transcription product of a
native gene, so as to inhibit the maturation and/or expression
of the transcription product, or by providing for expression
of a gene, either endogenous or exogenous, to be associated
with the development of a plant fiber.
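
To make the notion of a complementary (anti-sense) sequence
concrete, the reverse complement of a sense-strand DNA sequence can
be computed as sketched below. The function name and the six-base
input are hypothetical and serve only as an illustration; they are
not drawn from the CelA sequences of the figures.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(sense_dna: str) -> str:
    """Return the reverse complement of a sense-strand DNA sequence.

    Placing this sequence downstream of a promoter yields a transcript
    complementary to the transcript of the native gene.
    """
    return sense_dna.translate(_COMPLEMENT)[::-1]


# Hypothetical six-base fragment, read 5' to 3'.
print(antisense("ATGGCC"))
```

Applying the function twice returns the original sequence, which is
a convenient sanity check when preparing such an insert.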

The termination region which is employed in the 
expression cassette will be primarily one of convenience, 
since the termination regions appear to be relatively
interchangeable. The termination region may be native with
the transcriptional initiation region, may be native with the
DNA sequence of interest, or may be derived from another source.
The termination region may be naturally occurring, or wholly
or partially synthetic. Convenient termination regions are
available from the Ti-plasmid of A. tumefaciens, such as the
octopine synthase and nopaline synthase termination regions.
In some embodiments, it may be desired to use the 3'
termination region native to the cotton fiber transcription
initiation region used in a particular construct.

As described herein, in some instances additional 
nucleotide sequences will be present in the constructs to 
provide for targeting of a particular gene product to specific 
cellular locations. 

Similarly, other constitutive promoters may also be
useful in certain applications, for example the mas, Mac or
DoubleMac promoters described in United States Patent No.
5,106,739 and by Comai et al., Plant Mol. Biol. (1990)
15:373-381. When plants comprising multiple gene constructs are
desired, the plants may be obtained by co-transformation with
both constructs, or by transformation with individual 
constructs followed by plant breeding methods to obtain plants 
expressing both of the desired genes. 

A variety of techniques are available and known to
those skilled in the art for introduction of constructs into a
plant cell host. These techniques include transfection with
DNA employing A. tumefaciens or A. rhizogenes as the
transfecting agent, protoplast fusion, injection,
electroporation, particle acceleration, etc. For
transformation with Agrobacterium, plasmids can be prepared in
E. coli which contain DNA homologous with the Ti-plasmid,
particularly T-DNA. The plasmid may or may not be capable of
replication in Agrobacterium, that is, it may or may not have
a broad spectrum prokaryotic replication system such as does,
for example, pRK290, depending in part upon whether the
transcription cassette is to be integrated into the Ti-plasmid
or to be retained on an independent plasmid. The
Agrobacterium host will contain a plasmid having the vir genes
necessary for transfer of the T-DNA to the plant cell and may
or may not have the complete T-DNA. At least the right border
and frequently both the right and left borders of the T-DNA of
the Ti- or Ri-plasmids will be joined as flanking regions to
the transcription construct. The use of T-DNA for
transformation of plant cells has received extensive study and
is amply described in EPA Serial No. 120,516, Hoekema, In: The
Binary Plant Vector System Offset-drukkerij Kanters B.V.,
Alblasserdam, 1985, Chapter V, Knauf, et al., Genetic Analysis
of Host Range Expression by Agrobacterium, In: Molecular
Genetics of the Bacteria-Plant Interaction, Puhler, A. ed.,
Springer-Verlag, NY, 1983, p. 245, and An, et al., EMBO J.
(1985) 4:277-284. 

For infection, particle acceleration and electroporation, 
a disarmed Ti-plasmid (lacking particularly the tumor genes
found in the T-DNA region) may be introduced into the plant
cell. By means of a helper plasmid, the construct may be 
transferred to the A. tumefaciens and the resulting 
transfected organism used for transfecting a plant cell; 
explants may be cultivated with transformed A. tumefaciens or 
A. rhizogenes to allow for transfer of the transcription 
cassette to the plant cells. Alternatively, to enhance 
integration into the plant genome, terminal repeats of 
transposons may be used as borders in conjunction with a 
transposase. In this situation, expression of the transposase 
should be inducible, so that once the transcription construct 
is integrated into the genome, it should be relatively stably 
integrated. Transgenic plant cells are then placed in an 
appropriate selective medium for selection of transgenic cells 
which are then grown to callus, shoots grown and plantlets 
generated from the shoot by growing in rooting medium. 

To confirm the presence of the transgenes in transgenic 
cells and plants, a Southern blot analysis can be performed 
using methods known to those skilled in the art. Expression 
products of the transgenes can be detected in any of a variety 
of ways, depending upon the nature of the product, and include 
immunoassay, enzyme assay or visual inspection, for example
to detect pigment formation in the appropriate plant part or
cells. Once transgenic plants have been obtained, they may be
grown to produce fiber having the desired phenotype. The
fibers may be harvested, and/or the seed collected. The seed
may serve as a source for growing additional plants having the
desired characteristics. The terms transgenic plants and
transgenic cells include plants and cells derived from either
transgenic plants or transgenic cells.

The various sequences provided herein may be used as
molecular probes for the isolation of other sequences which
may be useful in the present invention, for example, to obtain
related transcriptional initiation regions from the same or
different plant sources. Related transcriptional initiation
regions obtainable from the sequences provided in this
invention will show at least about 60% homology, and more
preferred regions will demonstrate an even greater percentage
of homology with the probes.
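
As an illustration of the 60% criterion, the following minimal sketch scores two pre-aligned sequences. It assumes that "homology" is measured as simple percent identity over aligned positions, which the specification does not define; the function names and the two short sequences are hypothetical and are not taken from the figures.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions carrying the same base.

    Both inputs must already be aligned (equal length, '-' for gaps).
    Positions where both sequences have a gap are ignored; a gap
    opposite a base counts as a compared, non-identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 60.0) -> bool:
    """True when percent identity reaches the threshold (60% assumed here)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical probe and candidate region, already aligned.
probe = "ATGGCTTCAGGTACCA"
candidate = "ATGGCATCAGG-ACTA"
identity = percent_identity(probe, candidate)
```

In practice the alignment itself would come from a dedicated alignment program; only the scoring step is sketched here.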

Of particular importance is the ability to obtain related
transcription initiation control regions having the timing and
tissue parameters described herein. Thus, by employing the
techniques described in this application, and other techniques
known in the art (such as Maniatis, et al., Molecular
Cloning: A Laboratory Manual (Cold Spring Harbor, New York)
1982), other encoding regions or transcription initiation
regions of cellulose synthase as described in this invention
may be determined. The constructs can also be used in
conjunction with plant regeneration systems to obtain plant
cells and plants; thus, the constructs may be used to modify
the phenotype of fiber cells, to provide cotton fibers which
are colored as the result of genetic engineering to heretofore
unavailable hues and/or intensities.

Various varieties and lines of cotton may find use in the
described methods. Cultivated cotton species include
Gossypium hirsutum and G. barbadense (extra-long staple, or
Pima cotton), which evolved in the New World, and the Old
World crops G. herbaceum and G. arboreum.

By using encoding sequences to enzymes which control wood
quality and wood product characteristics, i.e., cellulose
synthase and O-methyltransferase (a key enzyme in lignin
biosynthesis), the relative synthesis of cellulose and lignin
by plants may be controlled. This is achieved by transformation
of the plant genome with a recombinant gene construct which
contains the gene specifying an enzyme critical to the synthesis
of cellulose or lignin or a lignin precursor, in either a sense
or an antisense orientation. If in an antisense orientation,
the gene will be transcribed so that mRNA having a sequence
complementary to the equivalent mRNA transcribed from the
endogenous gene is expressed, leading to suppression of the
synthesis of lignin or cellulose.
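
The relationship between a sense transcript and its antisense counterpart can be sketched as follows. This is an illustrative example only: the 12-base fragment is hypothetical (it is not taken from Figure 6 or Figure 7), and the function names are assumptions introduced here.

```python
# Map each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# RNA base-pairing partners used to check complementarity.
_RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA strand, read 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """mRNA with the same sequence as the coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def antisense_transcript(coding_strand: str) -> str:
    """RNA produced when the insert is placed in the reversed orientation."""
    return transcribe(reverse_complement(coding_strand))


def is_complementary(rna_a: str, rna_b: str) -> bool:
    """True when rna_a pairs base-for-base with rna_b in antiparallel fashion."""
    if len(rna_a) != len(rna_b):
        return False
    return all(_RNA_PAIRS.get(x) == y for x, y in zip(rna_a, reversed(rna_b)))


# Hypothetical coding-strand fragment used only for illustration.
fragment = "ATGGAAGCTTCA"
sense_rna = transcribe(fragment)
antisense_rna = antisense_transcript(fragment)
```

Because the antisense transcript can pair with the endogenous mRNA along its length, expression of the endogenous gene product is expected to be suppressed, as the paragraph above describes.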

If the recombinant gene has the lignin enzyme gene in 
normal, or "sense" orientation, increased production of the 
enzyme may occur when the insert is the full length DNA but 
suppression may occur if only a partial sequence is employed. 

Furthermore, the expression of one may be increased in 
this manner while the other is reduced. For instance, the 
production of cellulose may be increased through the
overexpression of cellulose synthase, while lignin production 
is reduced. By thus reducing the relative lignin content, the 
quality of wood for paper production would be improved.

EXAMPLES

The following examples are offered by way of illustration 
and not by limitation. 

Example 1
cDNA Libraries

An unamplified cDNA library was prepared in the
Lambda Uni-Zap vector (Stratagene, La Jolla, CA) using cDNA
derived from polyA+ mRNA prepared from fibers of Gossypium
hirsutum Acala SJ-2 harvested at 21 DPA, the time at which
secondary wall cellulose synthesis is approaching a maximal
rate (13). Approximately 250 plaques were randomly selected
from the cDNA library, phages purified and plasmids excised
from the phage vector and transformed.

The resulting clones/inserts were size screened on 0.8%
agarose gels (DNA inserts below 600 bp were excluded).



Example 2
Isolation and Sequencing of cDNA Clones
Plasmid DNA inserts were randomly sequenced using an
Applied Biosystems (Foster City, CA) Model 373A DNA sequencer.
A search of the GenBank EST databank revealed that there were
at least 23 rice and 8 Arabidopsis EST clones that contain
sequences similar to the cotton CelA1 DNA sequence. EST clone
S14965 was obtained from Y. Nagamura (Rice Genome Research
Program, Tsukuba). A series of deletion mutants was
generated and used for DNA sequencing analysis at the Weizmann
Institute of Science (Rehovot).



Example 3
Northern and Southern Analyses
Cotton plants (G. hirsutum cv. Coker 130) were grown in
the greenhouse and tissues harvested at the appropriate times
indicated and frozen in liquid N2. Total cotton RNA and
cotton genomic DNA were prepared and subjected to Northern and
Southern analyses as described previously (14).

UDP-Glc Binding Studies
To construct a GST-CelA1 protein fusion, a 1.6 kb
CelA1 DNA fragment containing a putative cytoplasmic domain
between the second and third transmembrane helices was PCR
amplified with the primers ATTGAATTCCTGGGTGTTGGATCAGTT and
ATTCTCGAGTGGAAGGGATTGAAA in a reaction containing 1 ng plasmid
DNA (clone 213) as template. The amplified fragment was
unidirectionally cloned into the EcoRI and XhoI sites of the
GST expression vector pGEX4T-3 (Pharmacia), generating a
fusion protein GST-CS containing the amino acids Ser215 to
Leu759 of the cotton CelA1 protein. Two CelA1 gene internal
PstI sites within the plasmid pGST-CS were used to generate
the deletion mutant pGST-CSΔU1, which lacks 196 amino acids
(and the U1 binding region) from Val252 to Ala447.
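
The design of the two primers and the residue counts quoted above can be checked with a short sketch. The primer sequences and residue numbers are those given in the text; the recognition sequences GAATTC (EcoRI) and CTCGAG (XhoI) are the standard ones for these enzymes, and the helper names are hypothetical.

```python
# Recognition sequences of the two cloning enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "XhoI": "CTCGAG"}

# Primers exactly as given in the text.
FORWARD = "ATTGAATTCCTGGGTGTTGGATCAGTT"
REVERSE = "ATTCTCGAGTGGAAGGGATTGAAA"


def find_sites(primer: str) -> dict:
    """Return {enzyme: 0-based offset} for each recognition site in the primer."""
    return {name: primer.find(seq) for name, seq in SITES.items() if seq in primer}


def annealing_part(primer: str, enzyme: str) -> str:
    """Portion of the primer 3' of the restriction site (the template-specific part)."""
    site = SITES[enzyme]
    return primer[primer.index(site) + len(site):]


def residue_span(first: int, last: int) -> int:
    """Number of residues in an inclusive range such as Ser215 to Leu759."""
    return last - first + 1


forward_sites = find_sites(FORWARD)
reverse_sites = find_sites(REVERSE)
fusion_residues = residue_span(215, 759)   # CelA1 portion of GST-CS
deleted_residues = residue_span(252, 447)  # removed in the deletion mutant
coding_bp = fusion_residues * 3            # bases needed to encode the insert
```

Each primer carries a single site after a three-base 5' extension, which is what allows the fragment to be cloned in one orientation only. The 545 encoded residues correspond to 1,635 bp, consistent with the 1.6 kb fragment, and the Val252 to Ala447 deletion is 196 residues, as stated.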

For the UDP-glc binding assays, α-32P-labeled UDP-glc was
prepared as described (15). The two fusion proteins GST-CS
and GST-CSΔU1 were expressed in E. coli and purified from
inclusion bodies (16). Proteins were suspended in sample
buffer, heated to 100°C for 5 min and approximately 50 ng of
the two fusion protein products and molecular weight standards
(Bio-Rad) subjected to SDS-PAGE using 4.5% and 7.5% acrylamide
in the stacking and separating gels, respectively (17). After
electrophoresis, protein transfer to nitrocellulose filters
was carried out in transfer buffer (25 mM Tris, 192 mM glycine
and 20% (v/v) methanol). The filter was briefly rinsed in
deionized H2O and incubated in PBS buffer for 15 min, then
stained with Ponceau-S in PBS buffer. After washing in
deionized H2O, protein was further renatured on the filter by
incubation in PBS buffer for 30 min and used directly for
binding assays. All binding buffers contained 50 mM HEPES/KOH
(pH 7.3), 50 mM NaCl and 1 mM DTT. In addition, binding buffers
contained either 5 mM MgCl2 and 5 mM EGTA (Buffer Mg/EGTA), 5 mM
EDTA (Buffer EDTA) or 1 mM CaCl2 and 20 mM cellobiose (Buffer
Ca/CB). The binding reaction was carried out in 7 ml containing
32P-labeled UDP-glc (1 x 10^7 cpm) at room temperature for 3
hours with constant shaking. Filters were washed separately
three times in 20 ml washing buffer consisting of 50 mM
HEPES/KOH (pH 7.3) and 50 mM NaCl for 5 min each, briefly dried
and analyzed on a Bio-imaging analyzer BAS1000 (Fuji).



Example 5

Identification, Differential Expression and
Genomic Analysis of Cotton CelA Genes

During the course of screening and sequencing random cDNA
clones from a cotton fiber specific cDNA library prepared from
RNA collected approximately 21 dpa, two cDNA clones were
discovered that exhibited small blocks of amino
acid homology to the proteins encoded by the bacterial CelA
genes. Clone 213 appeared to be a full-length cDNA while
another distinct clone, 207, appeared to be a partial clone
relative to the length of 213. These two clones were
partially homologous at the nucleotide and amino acid levels
and were designated CelA1 and CelA2, respectively.

These clones were then utilized as probes for Northern
blot analysis to determine their differential expression in
cotton tissues and developing cotton fiber. Figure 1
indicates the expression pattern for the CelA1 gene. The
CelA1 gene encodes an mRNA of approximately 3.2 kb in length and
is expressed at extremely high levels in developing fiber,
beginning at approximately 17 dpa, the time at which secondary
wall cellulose synthesis is initiated (13). The gene is also
expressed at low levels in all other cotton tissues, most
notably in root, flower and developing seeds. Since regions
of these genes are somewhat homologous at the nucleotide
level, gene specific probes were designed (using the
hypervariable regions described in Fig. 3) to distinguish the
specific expression patterns of CelA1 and CelA2. These gene
specific probes generated expression patterns (data not shown)
for the two genes identical to that shown in Figure 1, except
that a very low mRNA level was also detected in the primary
wall phase of fiber development (5-14 dpa) for the CelA2 gene
when the blots were overexposed. The CelA2 gene specific
probe also detected a 3.2 kb mRNA, analogous in size to the mRNA
specified by the gene for CelA1. Messenger RNAs for both
genes exhibit a characteristic degradation pattern similar to
other mRNAs specifically expressed late in fiber development
(J. Pear, unpublished observations) and this degradation is
not a result of the integrity of the mRNA preparations (14).
We estimate that both cotton CelA genes are expressed in
developing fiber at approximately 500 times their level of
expression in other cotton tissues and that they constitute
approximately 1-2% of the 24 dpa fiber mRNA.

In order to estimate the number of CelA genes in the
cotton genome, Southern analysis was performed utilizing both
CelA cDNAs independently as probes (Fig. 2). Although the two
cotton genes are fairly non-homologous at the nucleotide level
over their entire length, there are regions of homology (the
H-1, H-2 and H-3 regions described below) and it was thought
these regions could be useful in identifying other cotton CelA
genes. Figure 2 indicates that the CelA1 cDNA probe will
hybridize, albeit weakly, to the CelA2 genomic equivalent and
vice versa. The HindIII pattern for both genes and cDNA
probes is particularly discriminating. There are also a
number of other weakly hybridizing bands in these digests and
from these data we estimate that the cotton CelA genes
constitute a small family of approximately four genes.

Homology of Plant and Bacterial CelA Gene Products.

In addition to the two similar cotton CelA genes, a
homologous cDNA clone was discovered in the dbEST databank* of
rice and Arabidopsis ESTs. Accession No. D48636, the rice
clone having the longest insert, was obtained and sequenced,
and the homology comparisons with bacterial proteins reported
here also include results with the rice CelA. Figure 3 shows
the results of a multiple alignment of the deduced amino acid
sequences from the three plant CelA genes and four bacterial
CelA genes from A. xylinum (AcsAB and BcsA), E. coli, and A.
tumefaciens. Figure 4 shows hydropathy plots (18) of cotton
CelA1 similarly aligned with two bacterial CelA proteins and
serves as a more general summary of the overall homologies.

Of the plant genes, only the cotton CelA1 appears to be a
full-length clone of 3.2 kb exhibiting an open reading frame
that could potentially code for a polypeptide of 109,586 Da, a
pI of 6.4, and four potential sites of N-glycosylation.
Comparison of the N-terminal region of cotton CelA1 with
bacterial genes indicates that the plant protein has an
extended N-terminus similar in length and hydropathy profile,
but with only poor amino acid sequence homology, to the A.
tumefaciens CelA protein. In general, sequence homology of
plant and bacterial genes in both the N-terminal and
C-terminal regions is poor. However, although overall
similarity comparing plant to bacterial proteins is less than
25%, three homologous regions were identified, called H-1,
H-2, and H-3, where the sequence similarity rises to 50-60% at
the amino acid level. Interspersed between these regions of
homology are two plant-specific regions not found at all in
the bacterial proteins. Sequences in the first of these
insertions are highly conserved in the plant genes (P-CR),
while the second interspersed region seems to be a
hypervariable region (HVR), for there is considerable sequence
divergence among the plant proteins analyzed.

* The following accession numbers were identified as showing homology
with cotton CelA1. For rice: D48636, D41261, D40691, D46824,
D47622, D47175, D41766, D41986, D24655, D23732, D24375, D47732,
D47821, D47850, D47494, D24964, D24862, D24860, D24711, D23841,
D48053, D48612, D40673; for Arabidopsis: T45303, T45414, H76149,
H36985, Z30729, H36425, T45311, A35212.

None of the plant or bacterial CelA proteins contains
obvious signal sequences even though they are presumably
transmembrane proteins (4). However, the overall profiles
suggest two potential transmembrane helices in the N-terminal
and six in the C-terminal region of the cotton CelA1 that
could anchor the protein in the membrane (see arrows Fig. 3 and
also panel A of Fig. 5). The amino acid sequence positions for
these predicted transmembrane helices are: A (169-187), B
(200-218), C (759-777), D (783-801), E (819-837), F (870-888),
G (903-921), H (933-951). The central portions of the
proteins are more hydrophilic and are predicted to reside in
the cytoplasm and contain the site(s) of catalysis. More
detailed inspection of these hydrophilic stretches reveals
four particularly conserved sub-regions (marked U-1 through
U-4 on Figs. 3-4) that contain the conserved asp (D) residues
(in U-1 through U-3) and the motif QXXRW (in U-4) that have been
proposed (12) to be involved in substrate binding and/or
catalysis.
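
The predicted topology can be captured in a small data structure, which makes the size of the central hydrophilic region easy to derive. This is a sketch under stated assumptions: the helix coordinates are those listed above, while the helper names and the short test peptide used to demonstrate the QXXRW pattern are hypothetical and are not taken from the CelA1 sequence.

```python
import re

# Predicted transmembrane helices of cotton CelA1 (1-based, inclusive),
# as listed in the text.
TM_HELICES = {
    "A": (169, 187), "B": (200, 218),
    "C": (759, 777), "D": (783, 801), "E": (819, 837),
    "F": (870, 888), "G": (903, 921), "H": (933, 951),
}


def helix_lengths() -> dict:
    """Length in residues of each predicted helix."""
    return {name: end - start + 1 for name, (start, end) in TM_HELICES.items()}


def loop_between(first: str, second: str) -> tuple:
    """1-based inclusive residue range lying between two helices."""
    return TM_HELICES[first][1] + 1, TM_HELICES[second][0] - 1


# Q, any two residues, then R and W.
QXXRW = re.compile(r"Q..RW")


def find_qxxrw(protein: str) -> list:
    """1-based start positions of non-overlapping QXXRW motifs."""
    return [m.start() + 1 for m in QXXRW.finditer(protein.upper())]


central_start, central_end = loop_between("B", "C")
central_length = central_end - central_start + 1

# Hypothetical peptide, used only to show the motif search.
toy_peptide = "MKDLQVLRWGSA"
motif_positions = find_qxxrw(toy_peptide)
```

With these coordinates every predicted helix is 19 residues long, and the region between helices B and C runs from residue 219 to residue 758 (540 residues).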

Binding of UDP-glucose. Further evidence that the proteins
encoded by these plant genes are CelA homologs comes from our
demonstration that a DNA segment encoding the central region
of the cotton CelA1 protein, over-expressed in E. coli, binds
UDP-glc. We subcloned a 1.6 kb fragment of the cotton CelA1
clone to create a hybrid gene that encodes GST fused to the
CelA1 sequence encoding amino acid residues 215-759 of the
CelA1 protein (Fig. 5a). This region spans U-1 through U-4,
which are suspected to be critical for UDP-glc binding. As a
control, another GST fusion was created using a 1.0 kb PstI
fragment that had the U-1 region deleted and would not be
predicted to bind UDP-glc. The fusion proteins were
overexpressed in E. coli, purified, and shown to have the
predicted sizes of approximately 87 and 64 kD, respectively
(Fig. 5b). The purified proteins were then subjected to
SDS-PAGE, and blotted to nitrocellulose. Blotted proteins
were renatured, and incubated with 32P-UDP-glc in order to
test for binding (Fig. 5b). As predicted, the 87 kD GST-CelA1
fusion does indeed bind UDP-glc in a Mg2+-dependent manner,
while the shorter fusion with the U-1 domain deleted did not
show any binding. (Although not observed in the experiment
shown, in some experiments very weak labeling in the presence
of Ca2+ could be observed.) As further controls, note that
the molecular weight standards BSA and ovalbumin, proteins
lacking UDP-glc binding sites, show no interaction with
UDP-glc, while phosphorylase b, an enzyme inhibited by UDP-glc
(19), binds this substrate.
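
The predicted sizes of the two fusion proteins can be roughly reproduced from the residue numbers given in the text. The sketch below relies on two rule-of-thumb values that are assumptions and do not come from this specification: an average residue mass of about 0.110 kD and a GST moiety of about 26 kD. Linker residues contributed by the vector are ignored, and the function name is hypothetical.

```python
# Rule-of-thumb values (assumptions, not taken from the specification).
AVG_RESIDUE_KD = 0.110   # average mass of one amino acid residue
GST_TAG_KD = 26.0        # approximate mass of the GST moiety


def fusion_mass_kd(n_residues: int) -> float:
    """Approximate mass of a GST fusion carrying n_residues of insert."""
    return GST_TAG_KD + n_residues * AVG_RESIDUE_KD


# Residue ranges as stated in the text (inclusive).
full_insert = 759 - 215 + 1          # Ser215 to Leu759
deleted = 447 - 252 + 1              # Val252 to Ala447
short_insert = full_insert - deleted

full_kd = fusion_mass_kd(full_insert)
short_kd = fusion_mass_kd(short_insert)
```

Under these assumptions the estimates fall within a few kD of the reported values of approximately 87 and 64 kD.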

Figure 6 provides the encoding sequence to the cDNA to
celA1 (start ATG at approximately base 179), while Figure 7
provides the encoding sequence to the approximately two-thirds
3' of the cDNA to celA2.

Example 6 
Genomic DNA 

cDNA for the cellulose synthase clones was used to probe
for genomic clones. For both, full length genomic DNA was
obtained from a library made using the lambda dash 2 vector
from Stratagene, which was used to construct a genomic DNA
library from cotton variety Coker 130 (Gossypium hirsutum cv.
Coker 130), using DNA obtained from germinating seedlings.
The cotton genomic library was probed with a cellulose
synthase probe and genomic phage candidates were identified
and purified. Figure 8 provides an approximately 1 kb
sequence of the cellulose synthase promoter region which is
immediately 5' to the celA1 encoding region. The start of the
cellulose synthase enzyme encoding region is at the ATG at
base number 954.

Example 7
Cotton Transformation

Explant Preparation

Promoter constructs comprising the cellulose synthase
promoter sequences of celA1 can be prepared. Cotton Coker 315
seeds are surface disinfected by placing in 50% Clorox (2.5%
sodium hypochlorite solution) for 20 minutes and rinsing 3
times in sterile distilled water. Following surface
sterilization, seeds are germinated in 25 x 150 sterile tubes
containing 25 ml 1/2 x MS salts: 1/2 x B5 vitamins: 1.5%
glucose: 0.3% gelrite. Seedlings are germinated in the dark
at 28°C for 7 days. On the seventh day seedlings are placed in
the light at 28±2°C.

Cocultivation and Plant Regeneration

Single colonies of A. tumefaciens strain 2760 containing
binary plasmids pCGN2917 and pCGN2926 are transferred to 5 ml
of MG/L broth and grown overnight at 30°C. Bacteria cultures
are diluted to 1 x 10^8 cells/ml with MG/L just prior to
cocultivation. Hypocotyls are excised from eight day old
seedlings, cut into 0.5-0.7 cm sections and placed onto
tobacco feeder plates (Horsch et al. 1985). Feeder plates are
prepared one day before use by plating 1.0 ml tobacco
suspension culture onto a petri plate containing Callus
Initiation Medium (CIM) without antibiotics (MS salts: B5
vitamins: 3% glucose: 0.1 mg/L 2,4-D: 0.1 mg/L kinetin: 0.3%
gelrite, pH adjusted to 5.8 prior to autoclaving). A sterile
filter paper disc (Whatman #1) is placed on top of the feeder
cells prior to use. After all sections are prepared, each
section is dipped into an A. tumefaciens culture, blotted on
sterile paper towels and returned to the tobacco feeder
plates.
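
The dilution of the overnight culture to 1 x 10^8 cells/ml follows the usual C1V1 = C2V2 relationship. The sketch below is illustrative only: the overnight density of 2 x 10^9 cells/ml and the 10 ml final volume are hypothetical values, since the text gives neither, and the function name is an assumption.

```python
TARGET_CELLS_PER_ML = 1e8  # density stated in the text


def dilution_volumes(stock_cells_per_ml: float, final_volume_ml: float,
                     target_cells_per_ml: float = TARGET_CELLS_PER_ML) -> tuple:
    """Return (ml of culture, ml of MG/L broth) giving the target density.

    Uses C1 * V1 = C2 * V2. Raises ValueError if the stock is already
    below the target density, because dilution cannot concentrate cells.
    """
    if stock_cells_per_ml < target_cells_per_ml:
        raise ValueError("stock culture is more dilute than the target")
    culture_ml = final_volume_ml * target_cells_per_ml / stock_cells_per_ml
    return culture_ml, final_volume_ml - culture_ml


# Hypothetical overnight density and working volume.
culture_ml, broth_ml = dilution_volumes(2e9, 10.0)
```

In practice the density of the overnight culture would first be estimated, for example by optical density against a calibration, before the volumes are calculated.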

Following two days of cocultivation on the feeder plates,
hypocotyl sections are placed on fresh Callus Initiation
Medium containing 75 mg/L kanamycin and 500 mg/L
carbenicillin. Tissue is incubated at 28±2°C, 30 uE 16:8
light:dark period for 4 weeks. At four weeks the entire
explant is transferred to fresh callus initiation medium
containing antibiotics. After two weeks on the second pass,
the callus is removed from the explants and split between
Callus Initiation Medium and Regeneration Medium (MS salts:
40 mM KNO3: 10 mM NH4Cl: B5 vitamins: 3% glucose: 0.3% gelrite: 400
mg/L carb: 75 mg/L kanamycin).
Embryogenic callus is identified 2-6 months following
initiation and is subcultured onto fresh regeneration medium.
Embryos are selected for germination, placed in static liquid
Embryo Pulsing Medium (Stewart and Hsu medium: 0.01 mg/L NAA:
0.01 mg/L kinetin: 0.2 mg/L GA3) and incubated overnight at
30°C. The embryos are blotted on paper towels and placed into
Magenta boxes containing 40 ml of Stewart and Hsu medium
solidified with Gelrite. Germinating embryos are maintained at
28±2°C, 50 uE m-2 s-1, 16:8 photoperiod. Rooted plantlets are
transferred to soil and established in the greenhouse.

Cotton growth conditions in growth chambers are as
follows: 16 hour photoperiod, temperature of approximately
80-85°F, light intensity of approximately 500 µEinsteins. Cotton
growth conditions in greenhouses are as follows: 14-16 hour
photoperiod with light intensity of at least 400 µEinsteins,
day temperature 90-95°F, night temperature 70-75°F, relative
humidity of approximately 80%.

Plant Analysis

Flowers from greenhouse grown T1 plants are tagged at
anthesis in the greenhouse. Squares (cotton flower buds),
flowers, bolls etc. are harvested from these plants at various
stages of development and assayed for observable phenotype or
tested for enzyme activity.



Example 8
Transformation of Tree Species

Numerous methods are known to the art for transforming
forest tree species, for example U.S. Patent No. 5,654,190
discloses a process for producing a transgenic plant belonging
to the genus Populus, section Leuce.



The above results demonstrate how the cellulose synthase 
cDNA may be used to alter the phenotype of a transgenic plant 
cell, and how the promoter may be used to modify transgenic 
cotton fiber cells.

All publications and patent applications cited in this
specification are herein incorporated by reference as if each
individual publication or patent application were specifically
and individually indicated to be incorporated by reference.

Although the foregoing invention has been described in
some detail, by way of illustration and example for purposes
of clarity and understanding, it will be readily apparent to
those of ordinary skill in the art that certain changes and
modifications may be made thereto, without departing from the
spirit or scope of the appended claims.

CLAIMS

What is claimed is: 

1. An isolated DNA encoding sequence to a plant 
cellulose synthesis enzyme. 

2. The DNA encoding sequence of Claim 1 wherein 
said cellulose synthesis enzyme is cellulose synthase. 

3. The DNA encoding sequence of Claim 2 wherein 
said cellulose synthase is from cotton. 

4. The DNA encoding sequence of Claim 3 wherein
said cotton cellulose synthase is celA1.

5. The DNA encoding sequence of Claim 4 wherein
said celA1 is encoded by the sequence of Figure 6.

6. The DNA encoding sequence of Claim 3 wherein
said cotton cellulose synthase is celA2.

7. The DNA encoding sequence of Claim 6 wherein
said celA2 is encoded by the sequence of Figure 7.

8. An isolated DNA encoding sequence to a plant
cellulose synthesis promoter region.

9. The promoter encoding sequence of Claim 8
wherein said cellulose synthesis promoter region is to
cellulose synthase.

10. The promoter sequence of Claim 9 wherein said
cellulose synthase promoter region is from cotton.

11. The promoter sequence of Claim 10 wherein said
cotton cellulose synthase promoter region is from celA1.

12. The promoter sequence of Claim 11 wherein said
cotton cellulose synthase promoter region is from the
sequence of Figure 8.

13. A recombinant DNA construct comprising any of
the DNA encoding sequences of Claims 1-10.

14. The DNA construct of Claim 13 comprising as
operably joined components in the direction of
transcription, a cotton fiber transcriptional factor and
the sequence of any of Claims 1-7.

15. A plant cell comprising a DNA construct of Claims 13
or 14.

16. A plant comprising a cell of Claim 15.

17. A method of modifying fiber phenotype in a
cotton plant, said method comprising:

transforming a plant cell with DNA comprising a
construct of Claims 13 or 14.

18. A method of modifying the wood quality 
phenotype in a forest tree species, said method 
comprising:

transforming a plant cell of said species with
DNA comprising a construct of Claim 13.

19. A method according to Claim 18 wherein said
cellulose synthesis enzyme is cellulose synthase and
wherein the encoding sequence is in an antisense
orientation, wherein transcribed mRNA from said sequence
is complementary to the equivalent mRNA transcribed from
the endogenous gene, whereby the synthesis of cellulose
in said plant cell is suppressed.

20. A method according to Claim 18, wherein said
cellulose synthesis enzyme is cellulose synthase and
wherein the encoding sequence is in a sense orientation, 
and wherein the synthesis of cellulose in said plant cell 
is increased. 

21. A method according to Claim 20 wherein said 
plant cell additionally comprises a construct encoding a 
sequence to an enzyme involved in the synthesis of 
lignin or a lignin precursor. 

22. A method according to Claim 20 wherein said 
lignin encoding sequence is in an antisense orientation, 
wherein transcribed mRNA from said sequence is 
complementary to the equivalent mRNA transcribed from 
the endogenous gene, whereby the synthesis of lignin is 
suppressed.

[FIGURE 6C: celA1 cDNA sequence spanning base markers 1700-2300; annotated restriction sites: HpaI, ClaI, SmaI, HindIII, EcoRV, SphI]

[FIGURE 6D: celA1 cDNA sequence spanning base markers 2400-3200; annotated restriction sites: XhoI, SacI, HindIII]

[FIGURE 6, continued: celA1 cDNA sequence near base marker 3300 through the poly(A) tail; annotated restriction sites: PstI, XhoI, ApaI]

>Smal
I 

>Aval >EcoRl 
I I 1 
>Spel >BamKl I I >Pstl 

I I II II 

I I I I 20 I I 40 60 

i i * i . *■ i * * 9 * * * 

AACT^TGGATfcCCCCGGGCJGCAGGAA^^ 

TTGATCACCTAGGGGGCCCGACGTCCTTAAGCCGTGCTCGCTCCTCTACCCAAGGCAAAACATTCTTCGT 

>Narl >TthlllI 

I I 
80 I 100 120 | 140 

I * * « * * I * 

TAATGTTGAGCCCAGGGCGCCGGAGTTTTATTTCAATGAGAAGATTGAT1ATTTGAAGGACAAGGTCCAT 
ATTACAACTCGGGTCCCGCGGCCTCAAAATAAAGTTACTCTTCTAACTAATAAACTTCCTGTTCCAGGTA 

>DraI >Nsil 
. I I 
160 180 ! 200 I 

★***!*{* 
CCTAGCTTTGTTAAAGAACGGAGAGCCATGAAAAGGGAATATGAAGAATTTAAAGTAAGGATCAATGCAT 
GGATCGAAACAATTTCrTGCCTCTCGGTACTTTTCCCTTATACTTCTTAAATTTCATTCCrAGTTACGTA 

>Ncol 
I 

220 240 260 I 280 

* * * * * | * * 

TAGTAGCAAAAGCTCAGAAGAAACCAGAAGAAGGATGGGTGATGCAAGATGGCACCCCATGGCCCGGAAA 
ATCATCGTTTTCGAGTCTTCTTTGGTCTTCTTCCTACCCACTACGTTCTACCGTGGGGTACCGGGCCTTT 

>Bcll >ApaLl 

1 I 

I 300 320 I 340 

*****!** 
TAACACTCGTGATCATCCTGGAATGATTCAGGTCTATCTAGGAAGTGCCGGTGCACTCGATGTGGATGGC 
ATTGTGAGCACTAGTAGGACCTTACTAAGTCCAGATAGATCCTTCACGGCCACGTGAGCTACACCTACCG 

360 380 400 420 



* 



AAAGAGCTGCCTCGACTTGTCTATGTTTCTCGTGAGAAACGACCTGGTTATCAGCACCATAAGAAAGCCG 
TTTCTCGACGGAGCTGAACAGATACAAAGAGCACTCTTTGCTGGACCAATAGTCGTGGTATTCTTTCGGC 

>P3tl 

I 

440 I 460 480 

* * * | * * * * 

GTGCTGAGAATGCTCTGGTTCGAGTTTCTGCAGTGCTTACTAATGCACCCTTCATATTGAATCTGGATTQ 
CACGACTCTTACGAGACCAAGCTCAAAGACGTCACGAATGATTACGTGGGAAGTATAACTTAGACCTAAC 

>Bcll >BamHl 

I I 

| 500 520 540 | 560 

j * * • * * I * * 

TGATCATTACATCAACAATAGCAAGGCCATGAGGGAAGCGATGTGCTTTTTAATGGATCCTCAGTTTGGA 

ACTAGTAATGTAGTTGTTATCGTTCCGGTACTCCCTTCGCTACACGAAAAATTACCTAGGAGTCAAACC* 

>Hind3 >3s P H1 >Clal 

I I I
FIGURE 7A
I 580 600 I I 620

I • * * *| I • * 

AAGAAGCTTTGTTATGTTCAATTTCCACAGAGATTTGATGGTATTGATCGTCATGATCGATATGCTAATC 
TTCTTCGAAACAATACAAGTTAAAGGTGTCTCTAAACTACCATAACTAGCAGTACTAGCTATACGATTAG 

>EcoR5 
I 

640 I 660 680 700 

******* 

GAAATGTTGTCTTCTTTGATATCAACATGTTGGGATTAGATGGACTTCAAGGCCCTGTATATGTAGGCAC
CTTTACAACAGAAGAAACTATAGTTGTACAACCCTAATCTACCTGAAGTTCCGGGACATATACATCCGTG 

>Dra3 >AlwNI

I I 
1 720 740 I 760 

j * * * * * * * 

AGGGTGTGTTTTCAACAGGCAGGCATTGTATGGCTACGATCCACCAGTCTCTGAGAAACGACCAAAGATG 
TCCCACACAAAAGTTGTCCGTCCGTAACATACCGATGCTAGGTGGTCAGAGACTCTTTGCTGGTTTCTAC 

780 800 820 840 

w ***** * 

ACATGTGATTGCTGGCCTTCTTGGTGTTGCTGTTGTTGCGGAGGTTCTAGGAAGAAATCAAAGAAGAAAG 
TGTACACTAACGACCGGAAGAACCACAACGACAACAACGCCTCCAAGATCCTTCTTTAGTTTCTTCTTTC 

860 880 900 

****** 
GTGAAAAGAAGGGCTTACTCGGAGGTCTTTTATACGGAAAAAAGAAGAAGATGATGGGCAAAAACTATGT 
CACTTTTCTTCCCGAATGAGCCTCCAGAAAATATGCCTTTTTTCTTCTTCTACTACCCGTTTTTGATACA 

920 940 960 980 

******* 
GAAAAAAGGGTCTGCACCAGTCTTTGATCTCGAAGAAATCGAAGAAGGGCTTGAAGGATACGAAGAATTG 
CTTTTTTCCCAGACGTGGTCAGAAACTAGAGCTTCTTTAGCTTCTTCCCGAACTTCCTATGCTTCTTAAC 

>AseI >XmnI >XmnI

I I I

| 1000 I 1020 1040 

* I * | * * * * * 

GAGAAATCGACATTAATGTCGCAGAAGAATTTCGAGAAACGATTCGGACAATCACCGGTTTTCATTGCCT 
CTCTTTAGCTGTAATTACAGCGTCTTCTTAAAGCTCTTTGCTAAGCCTGTTAGTGGCCAAAAGTAACGGA 

>XmnI
I 

1060 1080 I 1100 1120 

• * * | * * * 
CAACTTTGATGGAAAATGGTGGCCTTCCTGAAGGAACTAATTCCACATCACTGATTAAAGAGGCCATTCA 
GTTGAAACTACCTTTTACCACCGGAAGGACTTCCTTGATTAAGGTGTAGTGACTAATTTCTCCGGTAAGT 

1140 1160 1180 

******* 
CGTAATTAGCTGTGGTTATGAAGAAAAAACTGAGTGGGGCAAAGAGATCGGATGGATTTATGGGTCGGTG 
GCATTAATCGACACCAATACTTCTTTTTTGACTCACCCCGTTTCTCTAGCCTACCTAAATACCCAGCCAC 

>NsiI
I 

1200 1220 I 1240 1260 

* * * | * * * * 
ACGGAAGATATATTAACAGGTTTCAAGATGCATTGTAGAGGGTGGAAATCGGTTTATTGTGTACCGAAAA 
TGCCTTCTATATAATTGTCCAAAGTTCTACGTAACATCTCCCACCTTTAGCCAAATAACACATGGCTTTT

1280 1300 1320 

* * * * * * * 

FIGURE 7B 
GACCGGCATTCAAAGGGTCCGCTCCAATCAATCTCTCGGATCGGTTGCACCAAGTTTTGAGATGGGCACT
CTGGCCGTAAGTTTCCCAGGCGAGGTTAGTTAGAGAGCCTAGCCAACGTGGTTCAAAACTCTACCCGTGA 

1340 1360 1380 1400

» * * * * * * 

TGGTTCTGTAGAAATTTTCCTTAGTCGTCACTGTCCACTTTGGTATGGTTATGGTGGAAAACTGAAATGG
ACCAAGACATCTTTAAAAGGAATCAGCAGTGACAGGTGAAACCATACCAATACCACCTTTTGACTTTACC 

>AvaI
I 

>PaeR7I 
I 

>XhoI
I 

| 1420 1440 1460 

| » * * * * * * 

CTCGAGAGGCTTGCTTATATCAACACCATTGTTTACCCTTTCACCTCGATCCCTTTACTCGCCTATTGTA 
GAGCTCTCCGAACGAATATAGTTGTGGTAACAAATGGGAAAGTGGAGCTAGGGAAATGAGCGGATAACAT 



>Pvu2 
I 

1480 1500 1520 1540 

I « * « * * * * 

CTATTCCAGCTGTTTGTCTTCTCACCGGCAAATTCATCATTCCAACTCTAAGCAACCTTACAAGTGTGTG 
GATAAGGTCGACAAACAGAAGAGTGGCCGTTTAAGTAGTAAGGTTGAGATTCGTTGGAATGTTCACACAC 

1560 1580 1600

******* 

GTTCTTGGCACTTTTCCTCTCCATCATTGCAACTGGAGTGCTTGAACTTCGATGGAGCGGGGTTAGCATC 
CAAGAACCGTGAAAAGGAGAGGTAGTAACGTTGACCTCACGAACTTGAAGCTACCTCGCCCCAATCGTAG 

1620 1640 1660 1680 

******* 

CAAGACTGGTGGCGCAATGAACAATTCTGGGTGATCGGAGGTGTCTCCGCCCATCTTTTTGCTGTCTTCC 
GTTCTGACCACCGCGTTACTTGTTAAGACCCACTAGCCTCCACAGAGGCGGGTAGAAAAACGACAGAAGG 

1700 1720 1740 

******* 

AGGGCCTCCTCAAAGTCCTAGCTGGAGTAGACACCAACTTCACCGTAACAGCAAAAGCAGCAGACGATAC 
TCCCGGAGGAGTTTCAGGATCGACCTCATCTGTGGTTGAAGTGGCATTGTCGTTTTCGTCGTCTGCTATG 

>EcoRI

I

I 1760 1780 1800 1820 

| * * * * * * * 

AGAATTCGGTGAACTTTATCTCTTCAAATGGACAACTCTCTTAATCCCTCCCACAACTCTGATAATACTG 
TCTTAAGCCACTTGAAATAGAGAAGTTTACCTGTTGAGAGAATTAGGGAGGGTGTTGAGACTATTATGAC 

1840 1860 1880

******* 

AACATGGTCGGAGTCGTGGCCGGAGTTTCAGACGCAATCAACAACGGCTATGGTTCATGGGGTCCATTGT 
TTGTACCAGCCTCAGCACCGGCCTCAAAGTCTGCGTTAGTTGTTGCCGATACCAAGTACCCCAGGTAACA 

1900 1920 1940 1960 

w ***** * 

TCGGCAAACTGTTCTTCGCATTCTGGGTCATTCTTCATCTTTACCCATTCCTCAAAGGTTTGATGGGGAG 
AGCCGTTTGACAAGAAGCGTAAGACCCAGTAAGAAGTAGAAATGGGTAAGGAGTTTCCAAACTACCCCTC 

>ClaI
I 

1980 2000 I 2020 

«**«*•» 

FIGURE 7C 
ACAAAACAGGACGCCCACCATTGTTGTGCTTTGGTCCATACTTTTGGCATCGATTTTCTCACTGGTTTGG
TGTTTTGTCCTGCGGGTGGTAACAACACGAAACCAGGTATGAAAACCGTAGCTAAAAGAGTGACCAAACC 

>ClaI

I 

2040 2060 2080 2100 

( * * * * * * * 

GTACGGATCGATCCCTTCTTGCCCAAACAAACAGGTCCAGTTCTTAAACAATGTGGCGTGGAGTGCTAAA
CATGCCTAGCTAGGGAAGAACGGGTTTGTTTGTCCAGGTCAAGAATTTGTTACACCGCACCTCACGATTT 

2120 2140 2160 

******* 

TGGTGTTTTACAAACCTTTCTTATTATTTTATTTTCCCTTTTTGCCACTACTGTTGATTTTGCTGTGATTC
ACCACAAAATGTTTGGAAAGAATAATAAAATAAAAGGGAAAAACGGTGATGACAACTAAACGACACTAAG

2180 2200 2220 2240 

* ****** 

TAAAAGGGATTTATCTTGTTTGTAAAAAGTCTCCTATGATTTTGTTGGTTCAATTTAATTTCTATATGGT 
ATTTTCCCTAAATAGAACAAACATTTTTCAGAGGATACTAAAACAACCAAGTTAAATTAAAGATATACCA 

>PaeR7I 
I 

>AvaI >Asp718

I I 

>SspI >DraI >XhoI >ApaI >KpnI

It I III 

| | 2260 2280 I 2300 | I

1 * I * * * | * |*| j 

AAAAAAATATTTCTTTAAATTAACTATAAAAAAAAAAAAAAAAAACTCGAGGGGGGGCCCGGTACC 
TTTTTTTATAAAGAAATTTAATTGATATTTTTTTTTTTTTTTTTTGAGCTCCCCCCCGGGCCATGG 
10 20 30 40 50 60

******* ***** 

GGGTGATTGACTAAAATTTTTAAAAATTTTGAAGGTTTTAATGAGAATTTTTAAACAAT? 

70 80 90 100 110 120 

******* ***** 

TTGTATGTTAAACTAAAACTTTCAAAAAAAATTTTGAAAGGTTTAATGAGAATTTTAAAA

130 140 150 160 170 180

************ 

ATTTTGAGCGGGCTAATTAAAATTTTTAAAAAATGTATAATAAAAAAATTCAAAAACTCT 

>ApaI
I

190 200 | 210 220 230 240 

* ****** * * *** * 

TTGAGGCCATAAAGGTCATCGGGCCCTTAAATACATCAGCTTGTTGTTTCCTCATATTAC 

>HpaI
I 

250 | 260 270 280 290 300 

* * *j* * * * * * * * * 

TCATGTTATTTCAGTTAACAGATATAATGGCTATCATTTGATTTAGGAGTCAAATCTAAA

>PacI 
i 

310 320 330 340 350 360 

* * * * * ** * * *j« * * 

AATTCGAAAAGTATAAAAACTAAAAAGGATTAAATTGAAGAACATTAATTAAATCAACAA 

>HpaI

I 

370 380 390 400 410 420 

************ 

TTTACTATTCCAATAACAGAATTTTGAGTTAACAAATTTAACTGCTACAATTTGGT*CGA 

>BclI
I 

430 440 450 460 | 470 480

* * * * * * * *|* * * * 

GACCAAAATTACAAAACCCGAAAAGTATTGGGACTAAAATTGATCAAATTAGAGTACATG 

490 500 510 520 530 540 

******* ***** 

GGTTAAATTCACAACTTACTTATGGTACAAGGATTAATAGCATAATTTCTCCTTAGGCAA 

>Hind3 
I 

550 560 570 | 580 590 600 

* *** ** * | * ** ** 

ATGCCAGTTAGTTAAAGATGTACCTTGCCCAACCGAAAGCTTCCTTAAACTTCCCGCAAT 

>Hind3 
I 

610 620 630 640 I 650 660 

* * w * * * * *jw * * * 

TTTTTAAATTTCTTTTTCCCTTAGAAAAAAGAACAAAAATGTAAGCTTTGCTTGTCAGAG 
670 680 690 700 710 720 
ATTTCTCTGCAAATACATTGACACCAACAACCTACCCTCCATTACACTACCAACCGGCCT

730 740 750 760 770 780 

****** 
TCCCCTTCAACTTTTCTTCACCATTACAACATGCCTATCTCCACCCTTAGCCCAACATGC 

790 800 810 820 830 840

************ 
ACTTATATCTTGTGTTTGGTTGTTTTTCTTTTTCATATAAAAACACACACCAAGACACAA 

850 860 870 880 890 900

************ 
AGGTATTGAGAGGTAAGTAGAGGGAAAGACCCTTTGGTTAGCATATTGTTTGTAGCATTG

910 920 930 940 950 960 



* * * 



* 



GGTTTTTTCTCAAGGAAGAAGAAGGAGAAAGATAAGTACTTTTTTTGAGAATGATGGAAT 

>EcoRI
I 

970 980 990 1000 1010 | 1020 

* ********* ** 
CTGGGGTTCCTGTTTGCCACACTTGTGGTGAACATGTTGGGTTGAATGTAAGCCGAATTC 

>SpeI >BamHI >Asp718

I I I 

1030 1040 | 1050 1060

* * * * *|* *|* 
CAGCACACTGGCGGCCGTTACTAGTGGATCCGCGCTCGGTACC 